Body Fat Percentage Presentation

Group 013E01

Joanne Lim, John Fu, Haoyuan Gao, Zhenchen Yi

Data Description

  • Accurate body fat measurement is vital but often inconvenient and expensive
  • Can we use simpler measurements like height, weight, and circumferences for estimates?
  • The data set bodyfat contains body fat percentages and other related measurements for 250 men. These measurements, which include height, weight and various body circumferences, were collected to explore alternatives to underwater body fat assessments.
Rows: 250
Columns: 16
$ Density <dbl> 1.0708, 1.0853, 1.0414, 1.0751, 1.0340, 1.0502, 1.0549, 1.0704…
$ Pct.BF  <dbl> 12.3, 6.1, 25.3, 10.4, 28.7, 20.9, 19.2, 12.4, 4.1, 11.7, 7.1,…
$ Age     <int> 23, 22, 22, 26, 24, 24, 26, 25, 25, 23, 26, 27, 32, 30, 35, 35…
$ Weight  <dbl> 154.25, 173.25, 154.00, 184.75, 184.25, 210.25, 181.00, 176.00…
$ Height  <dbl> 67.75, 72.25, 66.25, 72.25, 71.25, 74.75, 69.75, 72.50, 74.00,…
$ Neck    <dbl> 36.2, 38.5, 34.0, 37.4, 34.4, 39.0, 36.4, 37.8, 38.1, 42.1, 38…
$ Chest   <dbl> 93.1, 93.6, 95.8, 101.8, 97.3, 104.5, 105.1, 99.6, 100.9, 99.6…
$ Abdomen <dbl> 85.2, 83.0, 87.9, 86.4, 100.0, 94.4, 90.7, 88.5, 82.5, 88.6, 8…
$ Waist   <dbl> 33.54331, 32.67717, 34.60630, 34.01575, 39.37008, 37.16535, 35…
$ Hip     <dbl> 94.5, 98.7, 99.2, 101.2, 101.9, 107.8, 100.3, 97.1, 99.9, 104.…
$ Thigh   <dbl> 59.0, 58.7, 59.6, 60.1, 63.2, 66.0, 58.4, 60.0, 62.9, 63.1, 59…
$ Knee    <dbl> 37.3, 37.3, 38.9, 37.3, 42.2, 42.0, 38.3, 39.4, 38.3, 41.7, 39…
$ Ankle   <dbl> 21.9, 23.4, 24.0, 22.8, 24.0, 25.6, 22.9, 23.2, 23.8, 25.0, 25…
$ Bicep   <dbl> 32.0, 30.5, 28.8, 32.4, 32.2, 35.7, 31.9, 30.5, 35.9, 35.6, 32…
$ Forearm <dbl> 27.4, 28.9, 25.2, 29.4, 27.7, 30.6, 27.8, 29.0, 31.1, 30.0, 29…
$ Wrist   <dbl> 17.1, 18.2, 16.6, 18.2, 17.7, 18.8, 17.7, 18.8, 18.2, 19.2, 18…

Check Linearity

Null and Full Model


Call:
lm(formula = Pct.BF ~ ., data = bodyfat)

Residuals:
    Min      1Q  Median      3Q     Max 
-8.3746 -0.3725 -0.1157  0.2358 15.0629 

Coefficients: (1 not defined because of singularities)
              Estimate Std. Error t value Pr(>|t|)    
(Intercept)  4.494e+02  1.154e+01  38.961   <2e-16 ***
Density     -4.098e+02  8.384e+00 -48.876   <2e-16 ***
Age          1.395e-02  9.721e-03   1.435    0.153    
Weight       1.527e-02  2.015e-02   0.758    0.449    
Height      -1.558e-02  5.752e-02  -0.271    0.787    
Neck        -1.653e-02  7.084e-02  -0.233    0.816    
Chest        1.790e-02  3.259e-02   0.549    0.583    
Abdomen      1.833e-02  3.286e-02   0.558    0.578    
Waist               NA         NA      NA       NA    
Hip          2.537e-02  4.391e-02   0.578    0.564    
Thigh       -2.107e-02  4.421e-02  -0.476    0.634    
Knee        -1.657e-02  7.366e-02  -0.225    0.822    
Ankle       -8.160e-02  6.616e-02  -1.233    0.219    
Bicep       -5.256e-02  5.132e-02  -1.024    0.307    
Forearm      1.405e-02  6.229e-02   0.225    0.822    
Wrist       -1.883e-02  1.640e-01  -0.115    0.909    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.276 on 235 degrees of freedom
Multiple R-squared:  0.9777,    Adjusted R-squared:  0.9763 
F-statistic: 734.4 on 14 and 235 DF,  p-value: < 2.2e-16
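
The note "1 not defined because of singularities" and the NA row for Waist arise because Waist carries no information beyond Abdomen. In the rows printed in the data description, each Waist value equals the matching Abdomen value divided by 2.54. This suggests that Waist is Abdomen converted from centimetres to inches, although that is an assumption, since the source does not state units. A minimal sketch of the check, using only those printed rows plus a made-up response `y`:

```r
# First six Abdomen and Waist values as printed in the data description.
abdomen <- c(85.2, 83.0, 87.9, 86.4, 100.0, 94.4)
waist_printed <- c(33.54331, 32.67717, 34.60630, 34.01575, 39.37008, 37.16535)

# Largest gap between printed Waist and Abdomen / 2.54 (rounding only).
max_gap <- max(abs(waist_printed - abdomen / 2.54))
print(max_gap)

# Illustration with a hypothetical response: when one predictor is an exact
# multiple of another, lm() cannot estimate both and reports NA for one.
set.seed(1)
y <- rnorm(length(abdomen))      # made-up response, not from the data set
waist_exact <- abdomen / 2.54    # exact linear function of abdomen
fit <- lm(y ~ abdomen + waist_exact)
print(coef(fit))                 # the waist_exact coefficient is NA (aliased)
```

Dropping either Waist or Abdomen from the formula removes the NA row without changing the fitted values.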

Backward and Forward stepwise selection


Call:
lm(formula = Pct.BF ~ Density + Age + Abdomen, data = bodyfat)

Residuals:
    Min      1Q  Median      3Q     Max 
-8.2913 -0.3576 -0.0911  0.2319 15.4601 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept)  4.424e+02  8.738e+00  50.626  < 2e-16 ***
Density     -4.065e+02  7.279e+00 -55.844  < 2e-16 ***
Age          1.182e-02  6.579e-03   1.796   0.0737 .  
Abdomen      5.761e-02  1.332e-02   4.326 2.21e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.26 on 246 degrees of freedom
Multiple R-squared:  0.9772,    Adjusted R-squared:  0.9769 
F-statistic:  3513 on 3 and 246 DF,  p-value: < 2.2e-16

Call:
lm(formula = Pct.BF ~ Density + Abdomen + Age, data = bodyfat)

Residuals:
    Min      1Q  Median      3Q     Max 
-8.2913 -0.3576 -0.0911  0.2319 15.4601 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept)  4.424e+02  8.738e+00  50.626  < 2e-16 ***
Density     -4.065e+02  7.279e+00 -55.844  < 2e-16 ***
Abdomen      5.761e-02  1.332e-02   4.326 2.21e-05 ***
Age          1.182e-02  6.579e-03   1.796   0.0737 .  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.26 on 246 degrees of freedom
Multiple R-squared:  0.9772,    Adjusted R-squared:  0.9769 
F-statistic:  3513 on 3 and 246 DF,  p-value: < 2.2e-16
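
Backward and forward selection arrive at the same three predictors here, only listed in a different order. The source does not show how the selection was run; one common route in R is `step()` from the stats package, sketched below on simulated data. The data frame `sim` and its columns `x1` to `x4` are hypothetical stand-ins, not the bodyfat variables.

```r
# Simulated stand-in data: only x1 and x2 truly drive y.
set.seed(42)
n <- 200
sim <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n), x4 = rnorm(n))
sim$y <- 1 + 3 * sim$x1 - 2 * sim$x2 + rnorm(n)

null_model <- lm(y ~ 1, data = sim)   # intercept only
full_model <- lm(y ~ ., data = sim)   # all predictors

# Backward: start from the full model and drop terms while AIC improves.
backward <- step(full_model, direction = "backward", trace = 0)

# Forward: start from the null model and add terms while AIC improves.
forward <- step(null_model,
                scope = list(lower = ~ 1, upper = ~ x1 + x2 + x3 + x4),
                direction = "forward", trace = 0)

print(formula(backward))
print(formula(forward))
```

Both directions use AIC as the criterion by default, so they can agree on the final model, as they do in the two summaries above, but they are not guaranteed to.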

Log-Linear Model


Call:
lm(formula = log(Pct.BF) ~ Density + Age + Weight + Height + 
    Neck + Chest + Abdomen + Waist + Hip + Thigh + Knee + Ankle + 
    Bicep + Forearm + Wrist, data = bodyfat)

Residuals:
     Min       1Q   Median       3Q      Max 
-1.83722 -0.06872  0.03803  0.10425  1.10099 

Coefficients: (1 not defined because of singularities)
              Estimate Std. Error t value Pr(>|t|)    
(Intercept)  30.327682   1.893223  16.019  < 2e-16 ***
Density     -28.194739   1.390322 -20.279  < 2e-16 ***
Age           0.003161   0.001595   1.982  0.04865 *  
Weight       -0.002840   0.003325  -0.854  0.39396    
Height        0.028504   0.009489   3.004  0.00296 ** 
Neck         -0.004012   0.011622  -0.345  0.73027    
Chest         0.002934   0.005412   0.542  0.58830    
Abdomen      -0.002992   0.005408  -0.553  0.58062    
Waist               NA         NA      NA       NA    
Hip          -0.007137   0.007208  -0.990  0.32311    
Thigh         0.018438   0.007295   2.528  0.01215 *  
Knee          0.001581   0.012085   0.131  0.89603    
Ankle        -0.011140   0.010860  -1.026  0.30605    
Bicep         0.006681   0.008440   0.792  0.42940    
Forearm      -0.001098   0.010233  -0.107  0.91463    
Wrist         0.021332   0.026949   0.792  0.42941    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.2094 on 234 degrees of freedom
Multiple R-squared:  0.863, Adjusted R-squared:  0.8548 
F-statistic: 105.3 on 14 and 234 DF,  p-value: < 2.2e-16
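
In a log-linear model the response is log(Pct.BF), so each coefficient is read multiplicatively: a one-unit increase in a predictor multiplies the predicted Pct.BF by exp(coefficient), holding the other predictors fixed. A short sketch of that conversion, using the Height and Thigh estimates from the output above (the helper name `pct_change` is ours, for illustration only):

```r
# Convert a log-linear coefficient into the percentage change in the
# response associated with a one-unit increase in the predictor.
pct_change <- function(beta) 100 * (exp(beta) - 1)

height_effect <- pct_change(0.028504)  # Height estimate from the summary
thigh_effect  <- pct_change(0.018438)  # Thigh estimate from the summary

print(round(c(Height = height_effect, Thigh = thigh_effect), 2))
```

This works out to roughly a 2.9% relative increase in Pct.BF per unit of Height and about 1.9% per unit of Thigh.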

Summary of all 6 models

  • We observed that Neck, Abdomen, Bicep, Forearm, and Ankle each have a non-linear relationship with Pct.BF, so we log-transform these predictors in the linear-log and log-log models.
Predictor                     Full         Backward          Forward       Log-Linear       Linear-Log          Log-Log
                       Est.      p      Est.      p      Est.      p      Est.      p      Est.      p      Est.      p
(Intercept)          449.43 <0.001    442.38 <0.001    442.38 <0.001     30.33 <0.001    446.75 <0.001     -9.57  0.136
Density             -409.76 <0.001   -406.49 <0.001   -406.49 <0.001    -28.19 <0.001   -413.59 <0.001    -27.40 <0.001
Age                    0.01  0.153      0.01  0.074      0.01  0.074      0.00  0.049      0.01  0.152      0.00  0.080
Weight                 0.02  0.449                                       -0.00  0.394      0.01  0.647      0.00  0.435
Height                -0.02  0.787                                        0.03  0.003     -0.00  0.988      0.01  0.342
Neck                  -0.02  0.816                                       -0.00  0.730
Chest                  0.02  0.583                                        0.00  0.588      0.03  0.286      0.00  0.588
Abdomen                0.02  0.578      0.06 <0.001      0.06 <0.001     -0.00  0.581
Hip                    0.03  0.564                                       -0.01  0.323      0.03  0.462     -0.00  0.747
Thigh                 -0.02  0.634                                        0.02  0.012     -0.01  0.842      0.01  0.190
Knee                  -0.02  0.822                                        0.00  0.896     -0.02  0.831     -0.01  0.522
Ankle                 -0.08  0.219                                       -0.01  0.306
Bicep                 -0.05  0.307                                        0.01  0.429
Forearm                0.01  0.822                                       -0.00  0.915
Wrist                 -0.02  0.909                                        0.02  0.429      0.02  0.888      0.02  0.441
Neck [log]                                                                                -0.88  0.737     -0.60  0.135
Abdomen [log]                                                                              4.00  0.738     12.41 <0.001
Waist                                                                                     -0.08  0.805     -0.35 <0.001
Ankle [log]                                                                               -2.12  0.207     -0.42  0.101
Bicep [log]                                                                               -1.96  0.231      0.07  0.774
Forearm [log]                                                                              0.72  0.670     -0.14  0.599
Observations                   250              250              250              249              249              249
R2 / R2 adjusted     0.978 / 0.976    0.977 / 0.977    0.977 / 0.977    0.863 / 0.855    0.978 / 0.977    0.886 / 0.879
AIC                        847.962          831.066          831.066         1353.555          836.172         1308.930
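
The three transformed specifications differ only in where the logarithm is applied: log-linear transforms the response, linear-log transforms selected predictors, and log-log transforms both. A minimal sketch on simulated data follows; the data frame `sim` and the columns `y`, `x1`, and `x2` are hypothetical, since the source does not show its fitting code.

```r
set.seed(7)
n <- 150
sim <- data.frame(x1 = runif(n, 30, 45), x2 = runif(n, 70, 120))
# Strictly positive response so that log(y) is defined.
sim$y <- exp(0.5 + 0.8 * log(sim$x2) - 0.02 * sim$x1 + rnorm(n, sd = 0.1))

log_linear <- lm(log(y) ~ x1 + x2,      data = sim)  # log response only
linear_log <- lm(y      ~ x1 + log(x2), data = sim)  # log predictor only
log_log    <- lm(log(y) ~ x1 + log(x2), data = sim)  # log on both sides

# Coefficient names show which terms were transformed.
print(names(coef(linear_log)))
```

Writing the transformation inside the formula, rather than adding a transformed column to the data, keeps the original variables intact and labels the transformed terms in the summary output.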

Model selection conclusion

  • Generally, higher values of R-squared are better, but a very high R-squared could suggest overfitting, especially if it is much higher than the adjusted R-squared.
  • Lower AIC values indicate a better trade-off between goodness of fit and model complexity.
  • From the table, we conclude that the backward selection model (identical to the forward selection model) is the most appropriate: it has the lowest AIC (831.066), and its R-squared and adjusted R-squared agree to three decimal places (0.977).
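
The R-squared and AIC figures in the table can be cross-checked from the printed model summaries. The sketch below assumes the usual definitions: adjusted R-squared = 1 - (1 - R²)(n - 1)/(n - p - 1), and the Gaussian-likelihood AIC that R's `AIC()` reports for `lm` fits. The helper names are ours. Because the inputs are rounded summary values, the results agree with the table only approximately.

```r
# Adjusted R-squared from R-squared, sample size n, and p predictors.
adj_r2 <- function(r2, n, p) 1 - (1 - r2) * (n - 1) / (n - p - 1)

# AIC of a Gaussian linear model from its residual standard error.
# k = number of estimated coefficients including the intercept;
# the error variance counts as one further parameter.
aic_from_rse <- function(rse, n, k) {
  rss <- rse^2 * (n - k)
  n * (log(2 * pi) + 1 + log(rss / n)) + 2 * (k + 1)
}

# Full model: 14 estimated slopes (Waist is aliased), RSE 1.276.
full_adj <- adj_r2(0.9777, n = 250, p = 14)
full_aic <- aic_from_rse(1.276, n = 250, k = 15)

# Backward/forward model: 3 slopes, RSE 1.26.
back_adj <- adj_r2(0.9772, n = 250, p = 3)
back_aic <- aic_from_rse(1.26, n = 250, k = 4)

print(round(c(full_adj = full_adj, back_adj = back_adj), 4))
print(round(c(full_aic = full_aic, back_aic = back_aic), 1))
```

The full model gains almost nothing in R-squared while paying the AIC penalty of 2 per extra coefficient, which is why the three-predictor model ranks first.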